ABSTRACT



A safe general purpose virtual machine computing system 
having a general purpose memory protection model that is 
hardware architecture and programming language indepen- 
dent. The safe general purpose virtual machine computing 
system is software based to facilitate operation on hardware 
architectures that otherwise would prevent the exchange and 
successful execution of mobile code programs from one 
computer system to another. The safe general purpose virtual 
machine computing system also facilitates generating Byte- 
code Reduced Instruction Set Computer (BRISC) com- 
pressed mobile code that can be compiled or translated into 
executable code very quickly in addition to being compact 
for transmission purposes, and that is prevented from 
accessing unauthorized memory locations due to Software 
Fault Isolation techniques implemented in the code.

38 Claims, 7 Drawing Sheets 
SAFE GENERAL PURPOSE VIRTUAL MACHINE COMPUTING SYSTEM

CROSS REFERENCE TO RELATED APPLICATION

This application is a Continuation-In-Part of U.S. patent
application Ser. No. 08/566,613 filed Dec. 4, 1995 now U.S.
Pat. No. 5,761,477, issued Jun. 2, 1998, titled "Methods For
Safe and Efficient Implementations of Virtual Machines"
which is hereby incorporated by reference to the same extent
as if fully set forth herein.

FIELD OF THE INVENTION

The present invention relates to virtual machine
implementations, and in particular to a safe general purpose
virtual machine that generates optimized virtual machine
computer programs that are executable in a general purpose
memory protection model environment and are compact and
hardware architecture and programming language independent.

PROBLEM

A virtual machine is a metaprogram more generically
known as an operating system. Theoretically, the operating
system environment of a virtual machine facilitates generating
and/or executing virtual machine computer programs
that lack hardware architecture dependencies and/or programming
language dependencies. Thus, one advantage of a
virtual machine is that its operational semantics remain
constant from one computer program to the next regardless
of the origin or operating requirements of any one computer
program. However, realizing a true virtual machine computing
environment among a collection of fundamentally
incompatible computers, so that a single computer program
will function similarly on any one of the otherwise incompatible
computers, is a persistent problem in the computing
field.

One key component in the operational semantics of a
virtual machine is the virtual machine's memory model,
otherwise known as the memory protection model. A
memory model is the scheme in which a virtual machine
manages its computer memory. Memory models are important
because they define what memory locations in a given
program space are accessible by a given computer program.
Implementing a virtual machine in a manner that rigorously
enforces such a memory model is considered a "safe"
system that runs "safe" programs because something about
the implementation prevents undesirable memory access
activity by computer programs. For example, generally
speaking a "safe" virtual machine is one that prevents
unchecked memory access by an application program while
an "unsafe" virtual machine is one that allows application
program access to memory locations that are occupied by
the operating system itself, thereby leaving open the potential
to compromise the operating system's ability to provide
an operational environment for any application program at
all. Compromising the operating system itself typically
results in the complete operational failure of the operating
system and the application program.

Examples of virtual machine memory model types that
are the basis for a "safe" virtual machine implementation
include, but are not limited to, a restricted memory model
and a general purpose memory model. A restricted memory
model is the basis for a limited purpose virtual machine, also
known as a special purpose virtual machine. A limited
purpose virtual machine may in fact be "safe", however the
safety is secured at the expense of a design that operates with
certain types, but not all types, of hardware architectures. A
limited purpose virtual machine is also considered "safe"
because it is designed to operate with certain types, but not
all types, of programming languages. In other words,
although the limited purpose virtual machine is "safe", it is
not widely available to any general purpose computing
environment because the limited purpose virtual machine
relies on the existence of certain hardware architectural
features and/or programming language features to guarantee
"safe" program operation. Absent the presence of the certain
hardware and/or programming language features, the limited
purpose virtual machine can not guarantee "safe" program
execution and may not be able to support program execution
at all.

Examples of limited purpose virtual machine implementations
that rely on hardware enforcement of a memory
protection model include, but are not limited to, the International
Business Machines Virtual Machine/370 that is a
single user virtual machine implementation, and the IVY
machine that is a shared distributed memory virtual machine
implementation. Both of the above limited purpose virtual
machine implementations are undesirable because they
require specific hardware features and/or operating system
features to guarantee a "safe" program environment and
such specific features do not exist in all computing environments
presently in operation today.

Examples of limited purpose virtual machine implementations
that rely on programming language enforcement of a
memory protection model include, but are not limited to,
high level syntax enforcement at compile-time as seen in
Fortran and PL/1 compilers, type-safe language restrictions
as seen in Java by Sun Microsystems or Telescript by
General Magic Corporation, eliminating support for general-purpose
or global pointers as in the LISP language, implementing
specialized code sequences to access shared objects
as in the Emerald language, and specialized language filters
or interpreters that detect undesirable program activities in
interpreted programming languages and in general purpose
compiled languages. However, none of the above software
based techniques, whether viewed individually or in
combination, directly or completely implements a safe general
purpose virtual machine based on a general purpose
memory protection model that is viable for use in applications
such as an Internet browser. Further, each of the above
techniques presents unique performance problems and/or
relies on specialized hardware or operating system support
that does not exist across all computing environments presently
in use today.

Additional factors that make limited purpose virtual
machines undesirable include delayed register allocation,
and large high level byte-code transport modules. Delayed
register allocation means that the virtual machine does not
anticipate register usage prior to the identification of a target
machine. Delayed register allocation results in delaying final
conversion of a virtual machine program until the program
reaches its native machine destination. This conversion
delay translates into a slower response time for a user that is
downloading a program from a source across a network.
Large high level byte-code transport modules means that the
modules are easily decompiled because they are in a byte-code
format that is substantially equivalent to source code,
and by definition source code is a bulky format for network
transport purposes due to repetitive characters and uncompressed
white space.

For these reasons, limited purpose virtual machines are
undesirable in view of the specific hardware architecture and
programming language dependencies, and the wide variety 
of alternative hardware architectures and/or programming 
languages that exist in present day computing environments. 

A general purpose memory model is the basis for a 
general purpose virtual machine. The general purpose 
memory model defines the scope of memory location acces- 
sibility in terms of the type of program requesting the access 
and the permissions held by the requesting program. For 
example, a first computer program operating with a first 
permission level might have free access to a first set of 
memory locations while a second computer program oper- 
ating with a second permission level might have unrestricted 
access to all memory locations. In addition, enforcing the 
general purpose memory model of a general purpose virtual 
machine allows the general purpose virtual machine to 
execute computer programs that originated in any standard 
programming language and run on any hardware architec- 
ture. 
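
The permission-level example above can be made concrete with a minimal
sketch. It is an illustration only, not the implementation disclosed
herein: the class name MemoryAccessPermission, the address ranges, and
the constant MEMORY_SIZE are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryAccessPermission:
    """Hypothetical record of the address ranges one program may access."""
    ranges: tuple  # (start, end) pairs, end exclusive

    def allows(self, address: int) -> bool:
        # An access is permitted only if it falls inside one granted range.
        return any(start <= address < end for start, end in self.ranges)


MEMORY_SIZE = 0x10000  # assumed size of the whole address space

# First program: a first permission level covering only a first set of locations.
first_program = MemoryAccessPermission(ranges=((0x4000, 0x5000),))
# Second program: a second permission level covering all memory locations.
second_program = MemoryAccessPermission(ranges=((0x0000, MEMORY_SIZE),))

print(first_program.allows(0x4800))   # True
print(first_program.allows(0x0100))   # False
print(second_program.allows(0x0100))  # True
```

In this sketch the permission travels with the requesting program rather
than with the hardware, which is the property the general purpose memory
model relies on.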

However, implementing a true general purpose virtual 
machine that is hardware architecture and programming 
language independent, remains a long standing and unat- 
tained goal in the computer industry, particularly in critical 
computing environments including, but not limited to, data- 
base systems, operating systems, and distributed computing. 
For example, a true general purpose virtual machine is 
desirable in distributed computing and network environ- 
ments given the proliferation of Local Area Networks 
(LANs), Wide Area Networks (WANs), and the recent 
popularity of the Internet's World Wide Web (WWW) and 
its accompanying browsers, where users want to distribute 
"mobile code" from one computer to the next without concern 
for different hardware architecture dependencies and/or pro- 
gramming language dependencies. A similar need exists in 
distributed database environments where general read/write 
access is required by multiple incompatible users or program 
fragments are stored and retrieved by multiple users having 
otherwise incompatible hardware architectures, operating 
systems, and programming language dependencies. 

For the reasons stated above, there is a long felt need for 
a general purpose virtual machine supported by a general 
purpose memory protection model, that eliminates opera- 
tional program dependencies on specific hardware architec- 
tures and/or programming languages, yet is efficient in terms 
of implementation, maintenance, and run time operation. 

SOLUTION 

The above identified problems are solved and an advance- 
ment made in the field by the safe general purpose virtual 
machine computing system of the present invention. The 
safe general purpose virtual machine computing system 
directly addresses the primary scalability factors that include 
the ability to generate an executable general purpose virtual 
machine computer program that is significantly reduced in 
size and free of hardware architecture and/or programming 
language dependencies, in addition to the ability to convert 
or translate from source code to native machine instructions 
quickly so that a general purpose virtual machine computer 
program can execute in a safe software based virtual 
machine environment that is free of hardware, programming 
language, and/or memory protection dependencies. More 
particularly, a virtual machine computer program generated 
pursuant to the present invention can be transmitted more 
transparently and quickly across a network due to the 
strategically compressed code, and the transmitted program 
can be compiled into safe efficient code faster due to the 
pre-allocation of registers, called explicit register allocation, 
at the time the program is made safe prior to transmitting the 
program. Further, code obscurity results from explicit reg- 
ister allocation and compressing the module being transmit- 
ted across the network due to the compilation process so that 
the virtual machine computer program in its object form is 
stripped of symbols and bears little resemblance to its 
original source code. Finally, the compactness of the com- 
piled program is due in large part to a Bytecode Reduced 
Instruction Set Computer (BRISC) format that uses variable 
length instruction operation codes to save space. BRISC 
variable length instructions use special instruction encod- 
ings for commonly used operation codes and combinations 
of operation codes to limit the overall size of the object form 
of a compiled virtual machine computer program. 
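
The space saving from variable length operation codes can be sketched
as follows. This is a toy encoding written for illustration and is not
the BRISC format itself: the opcode names, the one-byte code table, the
ESCAPE byte, and the function encode are all assumptions. Frequent
opcodes, and one frequent pair of opcodes, receive a single byte, while
every other opcode costs two bytes.

```python
ESCAPE = 0xFF  # assumed marker byte that introduces a two-byte opcode

# Hypothetical one-byte codes for common opcodes and one common opcode pair.
SHORT_CODES = {
    ("load",): 0x00,
    ("store",): 0x01,
    ("add",): 0x02,
    ("load", "add"): 0x03,  # a combination of operation codes
}
# Hypothetical second bytes for the less common opcodes.
LONG_CODES = {"mul": 0x00, "div": 0x01, "jump": 0x02}


def encode(opcodes):
    """Encode a list of opcode names, preferring the shortest form."""
    out = bytearray()
    i = 0
    while i < len(opcodes):
        pair = tuple(opcodes[i:i + 2])
        if len(pair) == 2 and pair in SHORT_CODES:
            out.append(SHORT_CODES[pair])      # two opcodes in one byte
            i += 2
        elif (opcodes[i],) in SHORT_CODES:
            out.append(SHORT_CODES[(opcodes[i],)])
            i += 1
        else:
            out += bytes([ESCAPE, LONG_CODES[opcodes[i]]])
            i += 1
    return bytes(out)


encoded = encode(["load", "add", "store", "mul"])
print(encoded.hex())  # 0301ff00
```

Four opcodes occupy four bytes here; under the same assumed table a
fixed two-byte encoding would need eight.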

Generating an executable general purpose virtual machine 
computer program includes, but is not limited to, examining 
the input program to determine the general safety of the 
program and generating a safe sequence of machine execut- 
able instructions from the foreign source program using a 
Software-based Fault Isolation (SFI) technique if the safe 
sequence has not already been generated. The SFI technique 
includes compiling the foreign source program into a 
generic virtual machine program, identifying unsafe ones of 
the generic virtual machine program instructions that are 
capable of accessing a memory address other than the 
memory addresses defined by the assigned memory access 
permission, replacing symbolic references with register 
references, and modifying the generic virtual machine pro- 
gram into a safe virtual machine program to prevent access 
to memory addresses other than the memory addresses 
defined by the assigned memory access permission. The safe 
virtual machine program can then be compiled and option- 
ally compressed using BRISC instruction formats for stor- 
age in a persistent memory and/or transmitted to another 
computing device for execution. 
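
The identify-and-modify portion of the SFI technique described above
can be sketched on a toy instruction list. The sketch is an assumption
made for illustration, not the disclosed method: the tuple instruction
format, the "check" pseudo-instruction, and the function make_safe are
hypothetical, and the sketch guards every load and store rather than
only those that could actually stray.

```python
def make_safe(program, lo, hi):
    """Return a copy of a toy program with a bounds check before each
    memory-accessing instruction.

    Each instruction is a tuple (opcode, operand, ...). For "load" and
    "store" the last operand is assumed to name the address register.
    """
    safe = []
    for op, *args in program:
        if op in ("load", "store"):
            address_register = args[-1]
            # Guard the access: the address must lie in [lo, hi).
            safe.append(("check", address_register, lo, hi))
        safe.append((op, *args))
    return safe


generic = [
    ("add", "r1", "r2"),
    ("store", "r1", "r3"),
    ("load", "r4", "r3"),
]
for instruction in make_safe(generic, 0x4000, 0x5000):
    print(instruction)
```

A real rewriter would also skip accesses it can prove safe, which
reduces the number of inserted instructions.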

Executing a general purpose virtual machine computer 
program in a safe virtual machine environment includes, but 
is not limited to, decompressing the virtual machine 
program, verifying that the virtual machine program is safe 
and free of hardware and/or programming language 
dependencies, and executing the virtual machine program in 
a general purpose memory protection model environment. 
The safe general purpose virtual machine computing system 
further includes assigning a memory access permission to a 
foreign source program and generating a safe sequence of 
machine executable instructions from the foreign source 
program using SFI techniques if the safe sequence has not 
already been generated. The safe general purpose virtual 
machine computing system further includes executing the 
safe virtual machine program within the memory addresses 
defined by the assigned memory access permission. Execut- 
ing a safe virtual machine program within the memory 
addresses defined by the assigned memory access permis- 
sion includes generating a program execution interrupt to 
trap the virtual machine program at the time the virtual 
machine program attempts to access a memory address other 
than the memory addresses defined by the assigned memory 
access permission. In the preferred embodiment, the safe 
general purpose virtual machine is implemented in the 
context of an Internet browser used to transfer virtual 
machine computer programs to and from local and remote 
virtual machines in the network. 
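
Trapping a program at the moment of a stray access can be sketched
with an exception standing in for the program execution interrupt. The
names MemoryViolation and run, the three-field "store" instruction, and
the dictionary used as memory are assumptions made for this example.

```python
class MemoryViolation(Exception):
    """Stand-in for the program execution interrupt described above."""


def run(program, memory, lo, hi):
    """Execute toy (opcode, address, value) stores; trap on the first
    address outside the permitted range [lo, hi)."""
    for op, address, value in program:
        if not lo <= address < hi:
            raise MemoryViolation(
                f"{op} at {address:#x} outside [{lo:#x}, {hi:#x})"
            )
        memory[address] = value
    return memory


print(run([("store", 0x4000, 7)], {}, 0x4000, 0x5000))  # {16384: 7}

try:
    run([("store", 0x9000, 1)], {}, 0x4000, 0x5000)
except MemoryViolation as violation:
    print("trapped:", violation)
```

The handler that catches the exception decides what happens next, for
example ending the program or discarding only the offending access.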

An alternative embodiment of the safe general purpose 
virtual machine computing system includes optimizing the 
virtual machine program by minimizing the number of 
modifications required in the safe virtual machine program. 

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1 illustrates an example of a general purpose com- 
puter system in block diagram form that can make use of the 
safe general purpose virtual machine computing environ- 
ment; 
FIG. 2 illustrates a typical hardware architecture for a
processor complex in block diagram form;
FIG. 3 illustrates a virtual machine architecture in block
diagram form;
FIG. 4 illustrates an overview of operational steps for the
safe general purpose virtual machine computing environment
in flow diagram form;
FIG. 5 illustrates the operational details in flow diagram
form for determining if a program is safe;
FIG. 6 illustrates the optional program optimizing operational
details in flow diagram form;
FIG. 7 illustrates the operational details for generating a
safe virtual machine program in flow diagram form; and
FIG. 8 illustrates the load and execute operational details
for a safe virtual machine program in flow diagram form.

DETAILED DESCRIPTION

Computing Environment Example - FIG. 1
FIG. 1 illustrates a block diagram hardware architecture
example of a computer system 100 that can host the safe
general purpose virtual machine computing system of the
present invention. The computer system 100 can be used to
perform inter-network or intra-network exchanging and
executing of mobile code with other computer systems
across a network. Appropriately translated program instructions
are executable on processor 102 of computer system
100. Processor 102, also known as the Central Processing
Unit (CPU), stores and/or retrieves program instructions
and/or data from memory devices that include, but are not
limited to, Read Only Memory (ROM) 108 and Random
Access Memory (RAM) 110 by way of memory bus 152.
Another accessible memory device includes non-volatile
memory device 112 by way of local bus 150. User input to
computer system 100 can be entered by way of keyboard
104 and/or pointing device 106. Human readable output
from computer system 100 can be viewed on display 114 or
in printed form on local printer 115.

More importantly for purposes of the present invention,
computer system 100 is accessible from and has access to
foreign nodes and other remote facilities by way of MODulator
DEModulator (MODEM) 113 through public telephone
or cable communication line 117, or network communication
line 116 which can include, but is not limited to,
a LAN or WAN communication line, an Intranet communication
line, an Internet communication line, or any combination
of the aforementioned. Communication lines 116
and 117 facilitate connectivity among multiple potentially
diverse computer systems each having potentially incompatible
computing environments with each other.

Hardware Architecture Example - FIG. 2
FIG. 2 illustrates a block diagram example of a hardware
architecture for a processor 102. The figure is presented for
discussion purposes only and is in no way intended as a
limitation on the variety of specific component configurations
that exist from processor to processor.

Components in processor 102 include, but are not limited
to, an input data bus 230 into main and/or cache memory
235, a local processor bus 233 facilitating main/cache
memory connectivity to an instruction/destination decoder
and processor registers 220, a MUltipleXer (MUX) 210
having multiple selection line inputs 221 from registers/decoder
220, and an Arithmetic Logic Unit (ALU) 215 that
selects and performs program operations. Output busses
231-232 from the ALU 215 distribute processed output to
external and internal destinations respectively. The main/cache
memory 235 is typically a Random Access type
Memory (RAM) that contains the relevant portions of an
operating system process 240, free memory area(s) 243, and
typically a plurality of non-system processes or programs as
illustrated by program A 241 and program B 242. The
[illegible] of memory allocated to program A 241 and
program B 242 include enough memory for the program
[illegible] memory required for normal operation.
Either or both program A 241 and program B 242 can
[illegible] to the extent that the process can be
downloaded from another computer system from across a
network.

Referring to FIGS. 1 and 2 in combination, it is important
to note that absent a virtual machine implementation even
with a restricted memory protection model, a computer
system 100 is vulnerable to undesirable program activity.
For example, a foreign program can be downloaded directly
or indirectly from another computer system by way of
network connection 116 into the memory 235 of processor
102. If the operating system 240 or processor architecture
102 is not compatible with the foreign program B 242 for
example, then the foreign program may not be executable at
all by processor 102. Alternatively for example, if foreign
program B 242 is executable but is allowed to access
locations in memory 235 occupied by the operating system
in partition 240, then the computing environment for computer
system 100 is vulnerable to an unrecoverable system
error. Less catastrophic but no less undesirable failures can
occur if foreign program B 242 is allowed to access locations
in memory 235 occupied by another process such as
process A in partition 241. Thus, there is a need for the safe
general purpose virtual machine of the present invention to
prevent undesirable and/or unauthorized activity by any
process without appreciably affecting the overall processing
performance of processor 102.

General Purpose Virtual Machine Architecture - FIG. 3
FIG. 3 illustrates a safe general purpose virtual machine
in block diagram form. The architecture of the virtual
machine is hardware independent so that the underlying
processor 102 components including, but not limited to,
input data bus 230, main/cache memory 235, registers/decoder
220, MUX 210, ALU 215, and output bus 231, are
substantially similar if not identical to the processor 102
components discussed in FIG. 2. One fundamental difference
between the virtual machine of FIG. 3 and the common
operating system machine of FIG. 2 is that the virtual
machine of FIG. 3 controls the memory access activity of
processes in main/cache memory 235. The hardware components
of FIGS. 2 and 3 in other respects are otherwise
substantially similar.

The main/cache memory 235 in FIG. 3 includes a virtual
machine control program or operating system 340, free
memory space 243, a local virtual machine program 341,
and a foreign virtual machine program 342. One key difference
between a local virtual machine program 341 and a
foreign virtual machine program 342 is that the local program
was generated on the immediate machine and could be
considered a "trusted" or "safe" program whereas the foreign
virtual machine program originated on another computer
system and for that reason is considered a "distrusted"
or "unsafe" program in terms of its potential to attempt
unauthorized or undesirable memory accesses beyond its
own partition of main/cache memory 235. To prevent unauthorized
and/or undesirable memory accesses from
occurring, the safe general purpose virtual machine of the
present invention imposes a memory access permission 311
and 312 for the respective processes 341 and 342, and any
memory access by the respective processes 341 and 342
must conform to the general purpose memory protection
model represented by the memory access permission 311
and 312. If a process accesses main/cache memory 235 in
violation of its memory access permission, program instruction
execution is interrupted and control is passed to a
memory violation handling routine. The memory violation
handling routine is part of the virtual machine control
program 340 and the action taken by the handling routine
can range from shutting down the violating process to
merely blocking only the violating action but otherwise
allowing the process to continue.

Operational Overview - FIG. 4
FIG. 4 illustrates an overview of operational steps for the
safe general purpose virtual machine 400 in flow diagram
form. The operational steps begin at step 402 and proceed to
defining the general purpose virtual machine and the general
purpose memory model for the virtual machine at step 407.
Details of step 407 are disclosed in the text section titled
Virtual Machine Definition. At step 410, the Memory Access
Permissions (MAP) and the MAP allocation scheme are
defined. Details of step 410 are disclosed in the text section
titled Memory Access Permission Definition. Once the safe
general purpose virtual machine is defined, the techniques
applicable to generating and executing safe virtual machine
computer programs to run in the safe virtual machine
environment can be addressed.

One key feature of the safe general purpose virtual
machine environment is that users can give and take computer
programs without concern for hardware and/or software
compatibility issues. For example, the safe general
purpose virtual machine environment enables a first user
transferring a substantially or fully compiled safe virtual
machine computer program to a second user who can
execute the program. Fundamental hardware and/or programming
language compatibility issues are transparent
between the two users. Alternatively, the first user can
transfer a high-level programming language file to the
second user, and the second user can convert and compile
the high-level programming language file into a safe virtual
machine program that can safely be executed on the second
user's hardware platform. Again, the fundamental hardware
and/or programming language compatibility issues are
transparent between the two users. From the perspective of
either the first user or the second user, the operational steps
418-434 of FIG. 4 are substantially the same as discussed
below. The level of the remaining discussion related to
generating a safe virtual machine computer program,
assumes that any computer program in a high-level programming
language has been at least compiled to an assembly
language level by either the originating foreign computing
system or the destination local computing system.

Each foreign program that is received or otherwise
accessed by a computer must be evaluated to determine if
the program is safe to execute. It is possible that the
originating foreign computing system is delivering a program
that has already been made safe thus eliminating the
need for the destination computer to take the processing time
to make the program safe. Thus, if at decision step 422 it is
determined that the foreign program is in fact safe, then
processing continues at step 436 where the safe foreign
program is loaded and executed on the receiving computer
without further delay. Details of step 436 are disclosed in the
text accompanying FIG. 8. Details of determining if a
foreign program is safe to execute without further processing
at decision step 422 is disclosed in the text accompanying
FIG. 5.

If at decision step 422 it is determined that the foreign
program is not safe to execute without further processing,
then the foreign program must be made safe beginning with
decision step 427. At decision step 427, if it is determined
that program optimizations are necessary prior to making the
foreign program safe at step 434, then processing continues
to step 430. Details of step 430 are disclosed in the text
accompanying FIG. 6. If it is determined at decision step 427
that optimizing the foreign program is not necessary or
otherwise desirable, then processing proceeds directly to
step 434 where the foreign program is compiled into a safe
virtual machine program. Details of step 434 are disclosed
in the text accompanying FIG. 7.

Once a safe virtual machine program is generated at step
434, the safe virtual machine program can be either loaded
and executed at step 436, or transferred or otherwise made
available to another user on another computing system at
step 438. A safe virtual machine computer program that is
made available to any other user need only be made safe
once, and the program can then be marked as being a safe
program by setting an internal semaphore to indicate safe or
not safe for future reference. One advantage to transferring
programs that have already been made safe is that receiving
computers of limited computational resources can execute
programs without the overhead of making the programs safe
prior to execution. Such computers of limited computational
resources can include, but are not limited to, a Personal
Digital Assistant (PDA), a Personal Communication Service
(PCS), an Internet terminal device or network computer, or
a television/cable set-top box adjunct.

If at decision step 441 it is determined that additional safe
virtual machine computer program processing is necessary
or desired, then processing continues at step 418.
Alternatively, if at decision step 441 it is determined that no
additional processing is necessary or desired, then processing
stops at step 445.

Virtual Machine Definition
A definition of the present invention's safe virtual
machine includes, but is not limited to, a register model,
instruction types overview, addressing modes, execution
environment, basic assembler syntax, operating system
conventions, and instruction set details. Summaries of key
components of the above identified virtual machine definition
categories are disclosed in the following text.

As an overall design philosophy, the safe general purpose
virtual machine of the present invention is a software
implemented operating system that relies on operating system
enforced memory models and general addressing safety
features rather than hardware architecture and/or programming
language memory model enforcement and addressing
safety features. One object of the safe general purpose
virtual machine architecture is that it functions as the target
of a safe general purpose virtual machine compiler in a
similar manner as a conventional prior art compiler might
target a specific hardware architecture. However, unlike
computer programs compiled by traditional compilers, virtual
machine computer programs compiled for the safe
general purpose virtual machine of the present invention can
be safely translated to native hardware instructions or other
interpretive strategies can be safely implemented. The safe
general purpose virtual machine instruction set is designed
for easy code translations to a wide variety of hardware
targets including, but not limited to, Complex Instruction Set
Computer (CISC) processors such as the Intel x86 series,
and Reduced Instruction Set Computer (RISC) processors
such as the Digital Equipment Corporation (DEC) Alpha and
the MIPS series. Although the additional instructions that
are included in virtual machine programs to enforce a
memory protection model by definition are additional performance
overhead, even a safe virtual machine computer
program that is translated to a native hardware instruction
set will run at substantially similar native performance
levels.

Note also that the safe general purpose virtual
machine architecture does not include a traditional notion of
a "protected operating mode" or "protection rings" because
the virtual machine computer programs are executed by a
host program rather than as a standalone executable. Additionally
there are no privileged instructions that are forbidden
to "normal" application programs and thus no privileged
mode because all privileged operations are carried on outside
the safe general purpose virtual machine environment.

The safe general purpose virtual machine register model
is substantially similar to a conventional register model that
includes one set of registers for integer values and one set of
registers for floating point values. In the preferred embodiment
there are 16 integer registers that are 32-bits each and
16 floating point registers that are 64-bits each.

The safe general purpose virtual machine instruction
types include, but are not limited to, control transfer
instructions, arithmetic instructions, bitwise manipulation
instructions, register/memory instructions, and miscellaneous
instructions. The control transfer type instructions are
substantially similar to conventional control transfer instructions
although two distinct differences exist. First, the control
instructions include specific instructions that are useful

[...]ity and to simplify the task of adding memory protection
when translating a virtual machine computer program to
native hardware instructions. One convention is that virtual
machine computer programs should contain only executable
code. Non-executable code such as data should be placed in
a read/only data segment or the stack or heap. This restriction
allows the operating system to better understand the
control flow for purposes of effectively translating the
program to native hardware instructions. The second convention
is that when a static jump table is used in assembly
language level programs, that the table be placed immediately
after an indirect jump through a table instruction to
facilitate safe jumps.

Memory Access Permission Definition
There are three types of memory access permissions that
apply to a given memory address: READ, WRITE, and
EXECUTE. READ and WRITE permissions carry the traditional
meaning of allowing READ access to a given
memory address where a READ permission exists and
allowing WRITE access to a given address where WRITE
permission exists. EXECUTE permission carries the meaning
in the preferred embodiment, that a given memory
address is accessible for the particular purpose of executing
an instruction located at that address as a machine instruction.
The permission table itself that contains the memory
access permissions, indicates for each virtual memory
address supported by the underlying native hardware
to give clues to the safe general purpose virtual machine whether a foreign computer program is allowed READ, 
translator Second, there are no trap type instructions to WRITE, and/or EXECUTE permission for a given address, 
invoke a privileged or supervisory operational mode. Such 30 Identifying A Safe Program — ^FIG. 5 
privileged modes are unnecessary in the safe general pur- FIG. 5 illustrates the operational details in flow diagram 
pose virtual machine because virtual machine computer form for determining if a program is safe. The verification 
programs can only execute in a user mode. All privileged steps 500 begin at step 502 and are the delails of decision 
operations are carried out by a trusted host application step 422 in FIG. 4. At step 510, the foreign program is 
preferably such as a browser or other underlying operating 35 examined as a sequence of instructions to identify if three 
system program. key invariants hold true. First, every iise of a register is 

The register/memory instruction types are substantially examined to determine that every unsafe instruction in the 

similar to conventional instructions that move data between program uses a dedicated register. An example of a safe 

registers and memory locations, with one exceptioiL Load instruction for an Intel x86 instruction set, for example, 

and store instructions are enhanced to allow a variety of 40 might be add eax,edx where the instruction only modifies 

higher level addressing modes that include, but are not scratch memory that is part of a CPU and does not, by itself, 

limited to, symbol plus register plus offset addresses to alter permanent data in the computer's volatile or non- 

fadlitate generating efGcient code on a variety of hardware volatile memory. Alternatively, an unsafe instruction might 

target platforms. be mov [eax+16],ebx where the instruction modifies the 

Additional miscellaneous instruction types are included in 45 contents of volatile memory at the address cax+16 that is 

the safe general purpose virtual machine to facilitate support sixteen bytes past the contents of machine register eax. If the 

for a variety of hardware target platforms. Examples of the computer program executing the unsafe instruction does not 

miscellaneous instructions include, but are not limited to, a have permission to access location eax-i-16, then allowing 

no-operation instmction, instructions that return specific the unsafe instruction to execute can enable an untrusted 

hardware environment information, and register-to-memory 50 computer program to complete an action that it does not 

location link and release instructions. have permission to perform and potentially execute subse- 

The program execution environment of the safe general quent instructions that are beyond the scope of the programs 
purpose virtual machine supports only a user operation permission- 
mode. In addition, two tables are available for executing If at decision step 514 it is determined that an unsafe 
program use that both contain a sequence of objects that can 55 instruction does not use a dedicated register, then the pro- 
be safely referenced by an executing program. One table is gram is identified as unsafe at step 543 and processing 
the symbol table that contains an indexed list of resolved continues at step 548 as previously disclosed in FIG. 4. If at 
symbols that arc available by direct reference or by a load decision step 514 it is determined that each unsafe instrac- 
conslant value instmction (cnst). The cnsl inslruaion takes lion does use a dedicated register, then processing continues 
an index value and places the address of the associated 60 at decision step 520. 

symbol into a register. Similarly, a second table called the If at decision step 520 it is determined that there is no 

call site table, contains a list of call sites that can be used for safety check between any one instmction in the program 

safe references to functions that are not defined in the flow that sets a register and an instruction that uses that same 

presently executing program. register, then the program is identified as unsafe at step 543 

Finally, two key operating system conventions must be 65 and processing continues at step 548 as previously disclosed 

followed by assembly language programs executing on the in FIG. 4. If at decision step 520 it is determined that there 

safe general purpose virtual machine to improve code qual- is at least one safety check between each instruction in the 
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program flow that sets a register and an instruction that uses instructions are examined to identify safety checks that are 

that same register, the processing continues at decision step related and nested one within the other. Two safety checks 

525. are considered related if they both are positioned in response 

If at decision step 525 it is determined that each safety to the setting or using of a common register. Nested safety 

check in the program is not part of a small set of safety 5 checks are removed from the program at step 618. A 

checks known to the verifying computer system, then the performance improvement is realized by removing nested 

program is identified as unsafe at step 543 and processing safety checks because the program has fewer instructions to 

continues at step 548 as previously disclosed in FIG. 4. A execute at run time. If at decision step 645 another perfor- 

small set of safety checks can include, but is not limited to, mance optimization is desirable, processing continues with 

scanning a set of pre-determined instruction sequences that lO another optimization technique at step 602. If at decision 

can either: 1) convert to a legal address any address that the step 645 it is determined that another performance optimi- 

computer program does not have READ, WRITE, and/or zation is not desirable, processing continues at step 650 in 

EXECUTE permission to access; and/or 2) alter unsafe the manner disclosed in FIG. 4. 

instructions so that they can cause a hardware exception that A second optimization technique 620 is a dedicated stack 

is catchable by a trusted host program such as a browser, if is register technique. Stack addressing is highly stylized 

the original address that would have been used was an illegal because it usually takes the form sp+c where sp is a stack 

address had the computer program remained unaltered. pointer that is preferably a dedicated register and c is a small 

If at decision step 525 it is determined that each safety positive integer. At step 622 the stack pointer sp is assigned 

check in the program is part of a small set of safety checl^ a value and the value is verified as valid at step 625. At step 

known to the verifying computer system, then all three 20 628, only the verified stack pointer is used for loading 

invariants are satisfied and the program is identified as safe instructions and calculating static instruction offsets for the 

at step 543. Processing continues at step 540 as previously program. Using a verified static pointer that never dianges 

disclosed in FIG. 4 with the verifying computer system's restdts in a substantial reduction in safety checks and 

knowledge that the software fault isolation safety checks are therefore a significant increase in program performance at 

properly applied in the program being examined. 25 run time. If at decision step 645 another performance 

Program Optimizing — FIG. 6 optimization is desirable, processing continues with another 

FIG. 6 illustrates the operational details in flow diagram optimization technique at step 602. If at decision step 645 it 

form for optimizing a virtual machine program. If at deci- is determined that another performance optimization is not 

sion step 427 in FIG. 4 the optional program optimizations desirable, processing continues at step 650 in the manner 

are desired, then program optimization is enabled or other- 30 disclosed in FIG. 4. 

wise implemented at step 430. Note that some optimizations A third optimization technique 630 is a local and/or global 

occur prior to, during, and after compile time, and for this optimization of the sequence of instructions actually gener- 

reason the optimization options are disclosed at this point in ated during compile time. Local and global optimization 

the discussion without any intent to suggest a limitation or techniques are commonly used by compilers to improve the 

requirement of that optimizations must occur only at a 35 program flow and/or performance due to specific instruction 

specific point in the processing. combinations, but were heretofore not used in virtual 

The optimizing steps 600 in FIG. 6 are the details of step machine implementations because they require more hard- 

430 in FIG. 4. Note that optimizing a program is not required ware architecture specific information at compile time than 

and different optimization techniques can be enabled or a virtual machine can provide. The presence and/or incor- 

otherwise implemented individually or in combination to 40 poration of annotations results in a performance 

achieve a satisfactory performance range for a program. optimization, however, annotations are not a requirement for 

Reasons why program optimization may not be desirable the implementation of program optimization or safety. The 

include, but are not limited to, too much processing over- effect of annotations on a computer program is to give the 

head to optimize a program, the run time of a program is virtual machine implementation more information about the 

short so that optimizing offers no significant performance 45 nature of an untrusted program such as what parts of the 

benefit, and the target program is not interactive or media program are intended to be executable and what parts are 

intensive so that optimizing offers no significant perfor- intended to be read-only data. The annotation information 

mance benefit. Reasons why program optimization may be does not have to b trusted for safety implementations to 

desirable include, but are not limited to, tuning a program's work however. A virtual machine implementation can use 

performance so that the program is substantially as efficient 50 annotation information to load and run a computer program 

as if it were compiled for a dedicated hardware architecture more efficiently. 

even with the safety checks embedded in the program. The Generating A Safe Virtual Machine Program — ^FIG. 7 

preferred optimization techniques include, but are not lim- One key aspect of the safe general purpose virtual 

ited to, the techniques disclosed in the following text accom- machine is that it is more than merely a system of runtime 

panying FIG. 6. Additional definitions of the terms related to 55 checks or sequences of machine instructions that are nec- 

safety checks and the software fault isolation technique that essary to encapsulate an untrusted program. Instead, the safe 

generates the safety checks, are disclosed in the text accom- general purpose virtual machine is a system that includes, 

panying FIG. 7. but is not limited to, a memory management system, a 

Program optimization selection and/or implementation compiler system, and a program loader, that each coopera- 

begins at step 602 and proceeds with one of three techniques 60 tively support a memory access permission table and seg- 

610, 620, and 630 as instructed by the safe general purpose regate the memory allocated and used by untrusted or 

virtual machine. A first optimization technique 610 is the otherwise unknown computer programs, so that as a whole 

safety check hoisting technique that is implemented after the the SFI system of the safe general purpose virtual machine 

safety checks are inserted into the program. At step 612, the supports a general memory protection model, 

program is examined as a sequence of instructions having an 65 FIG. 7 illustrates the operational details 700 for generat- 

operational flow, to identify blocks of instructions that do ing a safe virtual machine program in flow diagram form, 

not contain decision branches. At step 615, the blocks of Processing begins at step 702 and proceeds to convert each 
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line of the foreign program code into an equivalent line or 
sequence of general purpose virtual machine code at step 
710. A general purpose virtual machine language or code is 
a virtual machine program representation of the original 
foreign program in a form resembling assembly language S 
instructions that is hardware architecture and programming 
language independent. Key features of the instructions that 
comprise the general purpose virtual machine language is 
that the instruction set contains similarities to RISC instruc- 
tions although they include a more complex addressing lo 
scheme. The more complex addressing scheme includes 
flexible memory addressing to satisfy the needs of a variety 
of hardware platforms, enhanced memory protection, and 
higher order instruction enhancements that include, but are 
not limited to, memory block copy instructions and bitfield is 
instructions for direct memory access. An example of a 
bitfield instruction is: 

bfextuv n0,4(nl){26:3} 

where bfext.uv is the symbolic for a bitfield extraction 20 
operation code that extracts bits 26-28 from a word in 
memory located 4 bytes beyond the address in register rl, 
and places the extracted bits in register nO. Including bitfield 
instructions in the general purpose virtual machine instruc- 
tion set facilitates instruction sequences that contain univer- 25 
sally executable code that can be efficiently translated for 
substantially all hardware architectures. In other words, 
bitfield instructions facilitate the conversion of high-level 
programming language statements into virtual machine code 
by code generators, to generate endian-neutral computer 30 
programs. Bitfield instructions make it easier for compilers 
to output programs whose data manipulation instructions are 
neutral with respect to how the underlying native computer 
hardware chooses to lay out primitive data types such as 
integers and floating point numbers in volatile memory. 35 
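
As a rough illustration of what the bitfield extraction example computes, the following Python sketch models memory as a byte string. Several points are assumptions that the text does not state: that bit 0 is the least significant bit of the 32-bit word, that {26:3} means "start at bit 26, width 3", and that the host byte order is passed in explicitly. The function name bfext_uv and its parameters are hypothetical.

```python
def bfext_uv(memory: bytes, base: int, offset: int, start: int, width: int,
             byteorder: str = "little") -> int:
    """Unsigned bitfield extract from the 32-bit word at base + offset.

    Assumes bit 0 is the least significant bit of the loaded word.
    """
    addr = base + offset
    # Loading the word is the only step that depends on host byte order.
    word = int.from_bytes(memory[addr:addr + 4], byteorder)
    mask = (1 << width) - 1
    return (word >> start) & mask


# The same logical word stored on a little-endian and a big-endian host.
value = 0b101 << 26
mem_le = bytes(4) + value.to_bytes(4, "little")
mem_be = bytes(4) + value.to_bytes(4, "big")

# Models bfext.uv n0,4(n1){26:3} with n1 holding address 0.
n0_le = bfext_uv(mem_le, 0, 4, 26, 3, "little")
n0_be = bfext_uv(mem_be, 0, 4, 26, 3, "big")
```

Both calls produce the same field value, which is the sense in which a program written with bitfield instructions does not depend on the memory layout chosen by the native hardware.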

At step 718, unsafe instructions are identified among the
virtual machine instructions generated by step 710. An
unsafe instruction is one that jumps or stores to an address
that can not be statically verified as being within an
authorized memory segment according to the MAP. For example,
control transfer instructions such as a program-counter-relative
branch and store instructions using immediate
addressing modes, can be statically verified. However,
jumps based on the contents of a register such as upon return
from a procedure, or stores that use a register to hold the
target address, can not be statically verified.
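
The classification in step 718 can be sketched as follows. This is only an illustration under assumed names: the Instr record, its op and mode strings, and the half-open segment bounds are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str          # hypothetical kinds: "jump", "store", "load", "add", ...
    mode: str        # "pc_relative", "immediate", or "register"
    target: int = 0  # statically known target address, when there is one


def is_unsafe(instr: Instr, segment_lo: int, segment_hi: int) -> bool:
    """True if a jump or store cannot be statically verified as in-segment."""
    if instr.op not in ("jump", "store"):
        return False  # only jumps and stores are considered here
    if instr.mode == "register":
        return True   # the target is only known at run time
    # PC-relative and immediate targets are known statically; check them now.
    return not (segment_lo <= instr.target < segment_hi)


program = [
    Instr("jump", "pc_relative", 0x1800),  # statically inside the segment
    Instr("store", "register"),            # target held in a register
    Instr("add", "register"),              # neither a jump nor a store
]
unsafe = [i for i in program if is_unsafe(i, 0x1000, 0x2000)]
```

Only the register-based store ends up in the unsafe list, so it is the only instruction that would receive a safety check in the next step.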

Each unsafe instruction identified in step 718 is segment
matched in step 725. Segment matching, also known as
software encapsulation, is the process of inserting a safety
check prior to each unsafe instruction. A safety check is one
or more instructions that at run time determine whether the
target address used by the unsafe instruction has a valid
memory segment identifier prior to allowing the unsafe
instruction to proceed. If the memory segment identifier is
valid then the unsafe instruction is allowed to proceed. If the
memory segment identifier is invalid then the program traps
and control is turned over to a system error handling routine
as illustrated in the text accompanying FIG. 8 steps
814-830.

Specifically, segment matching requires that four of the N
registers provided by the underlying hardware architecture
be reserved as dedicated registers referred to as
dedicated-reg1 through dedicated-reg4. A dedicated register is used
only by the inserted segment matching instructions in the
following manner: 1) the target address of the unsafe
instruction is moved into dedicated-reg1; 2) the memory segment
identifier bits of the contents of dedicated-reg1 are
right-shifted into dedicated-reg2; 3) the relevant bits of
dedicated-reg2 are compared with dedicated-reg3 which contains the
memory segment identifier key; and 4) trap if the comparison
is not equal, otherwise the altered unsafe instruction uses
dedicated-reg1 which contains the valid address.
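
The four segment matching steps can be modelled in a few lines of Python. The sketch assumes 32-bit addresses whose upper 8 bits form the memory segment identifier. That split, the SegmentFault exception, and the function name are illustrative assumptions only, since the text does not fix a bit layout.

```python
SEGMENT_SHIFT = 24  # assumption: upper 8 bits of a 32-bit address


class SegmentFault(Exception):
    """Stands in for the trap to the system error handling routine."""


def segment_match(target: int, segment_key: int,
                  shift: int = SEGMENT_SHIFT) -> int:
    reg1 = target & 0xFFFFFFFF   # 1) target address into dedicated-reg1
    reg2 = reg1 >> shift         # 2) segment bits shifted into dedicated-reg2
    reg3 = segment_key           # dedicated-reg3 holds the segment key
    if reg2 != reg3:             # 3) compare
        raise SegmentFault(hex(reg1))  # 4) trap on mismatch
    return reg1                  # the unsafe instruction then uses reg1


checked = segment_match(0x12003456, 0x12)
```

Because the unsafe instruction uses the checked copy in dedicated-reg1 rather than the original register, a jump into the middle of the check sequence cannot bypass the comparison.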

Address sandboxing is an alternative to the segment
matching disclosed above in that address sandboxing will
further reduce run time overhead. To implement address
sandboxing, safety check instructions are inserted into the
virtual machine instruction sequence prior to each unsafe
instruction as was done with the segment matching safety
check. However, the address sandboxing safety check
instructions merely set the upper bits of the target address to
the correct memory segment identifier thereby forcing a
valid address to exist. Address sandboxing also reduces the
number of dedicated registers needed to only one by
implementing the following technique: 1) the target address of the
unsafe instruction is moved into dedicated-reg1 by ANDing
the target address with dedicated-reg2 that contains a mask
that allows the address bits to pass through but clears the
memory segment identifier bits; 2) the memory segment
identifier bits of dedicated-reg1 are set to the assigned
memory segment identifier by ORing the dedicated-reg1
contents with dedicated-reg3 that contains the assigned
memory segment identifier mask; and 3) the altered unsafe
instruction uses dedicated-reg1 which contains a valid
address.
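
A minimal sketch of the AND/OR sequence follows, again assuming 32-bit addresses with an 8-bit segment identifier in the upper bits. The bit layout and the function name are assumptions made for illustration.

```python
def sandbox(target: int, segment_id: int, shift: int = 24) -> int:
    """Force target into the segment segment_id without comparing or trapping."""
    reg2 = (1 << shift) - 1    # mask: passes address bits, clears segment bits
    reg3 = segment_id << shift # assigned memory segment identifier mask
    reg1 = target & reg2       # 1) AND with the mask in dedicated-reg2
    reg1 |= reg3               # 2) OR in the identifier from dedicated-reg3
    return reg1                # 3) the unsafe instruction uses dedicated-reg1


inside = sandbox(0x12003456, 0x12)   # already in the segment: unchanged
outside = sandbox(0x13003456, 0x12)  # out of segment: redirected, no trap
```

Unlike segment matching, an illegal address is not reported. It is silently mapped to some address inside the permitted segment, which is why the check needs no compare or branch.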

Once the safety checks are inserted into the virtual 
machine program at step 725, additional program optimiza- 
tions can optionally be implemented at step 734 as previ- 
ously disclosed in the text accompanying FIG. 6. Finally, at 
step 748 the remaining safe and optionally optimized virtual 
machine program is converted from assembly language type
instructions into machine level instructions. The conversion 
is commonly referred to as compiling and the output of the 
compiler is also known as the object code or binary code that 
is understood by the underlying hardware architecture. 
Additional object code size optimizations can be realized 
during the compile step as disclosed in the text section titled 
"Compiler Object Code Size Optimizations" below. 

In one embodiment, the conversion or compiling of step 
748 can be implemented using a template-driven strategy. In a
template-driven strategy, a set of templates is defined, each
responsible for converting a particular
virtual machine instruction or addressing mode into the
equivalent machine code for the underlying hardware. Once
an object code is produced, processing proceeds at step 755 
as previously disclosed in the text accompanying FIG. 4 step 
434. 
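
A template-driven translator can be sketched as a lookup table from virtual machine opcodes to output templates. The sketch below emits x86-style assembly text rather than binary machine code, and the template strings, the register mapping, and the translate function are all hypothetical. Only the mnemonics ld.iw and mov.i come from the text.

```python
# One template per virtual machine instruction form (hypothetical contents).
TEMPLATES = {
    "mov.i": "mov {dst}, {src}",
    "ld.iw": "mov {dst}, [{base}+{offset}]",
}

# Assumed mapping from virtual machine registers to x86 registers.
REG_MAP = {"n0": "eax", "n1": "ebx", "n2": "ecx", "sp": "esp"}


def translate(op: str, **operands) -> str:
    """Fill in the template for op, renaming registers for the target."""
    template = TEMPLATES[op]
    native = {name: REG_MAP.get(value, value)
              for name, value in operands.items()}
    return template.format(**native)


load = translate("ld.iw", dst="n0", base="sp", offset=4)
move = translate("mov.i", dst="n2", src="n0")
```

Supporting another hardware target under this scheme would mean supplying a different TEMPLATES table and register mapping while leaving the driver loop unchanged.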

Compiler Object Code Size Optimizations 

When storing compiled programs or transferring com- 
piled programs from a first computer to a second computer, 
the size of a compiled program can be a problem. The 
capacity and/or performance of transmission facilities and 
memory can become bottleneck problems for large compiled 
programs. In some scenarios it can be significantly faster or
more efficient to store or transmit compressed object code that can
later be interpreted or decompressed and executed. This fact
is self-evident when a large amount of object code is
transmitted at existing modem speeds. This fact can
also be true even with faster networks or for paging from a 
disk, or even for cache misses if the decompressor is fast 
enough. For these reasons, certain compiler optimizations 
can be implemented that directly target the potential bottle- 
neck areas related to transmissions and memory. 

When transmitting compiled code is the bottleneck, it is desirable to
implement the best possible compression technique provided that the
receiving computer's performance can afford to expand the compressed
code before execution. An object module that is compressed for this
purpose is called a "wire" code because the transmission facility or
the "wire" is the bottleneck. When memory facilities or memory
performance is the source of the bottleneck, the code must be stored
and interpreted in an operationally compressed form. For purposes of
this discussion, assume that the code includes jumps and calls so that
random access to at least some basic blocks of code is necessary.
Further, if some code must be compiled as a program is running, then
the Just-In-Time (JIT) compilation rate must be very high. Finally, if
both the transmission facilities and the memory facilities are
bottlenecks, then one preferred embodiment is to decompress a wire code
into a compressed and interpretable form.

The compression goals are to implement a compiler compression technique
that suits the specialized problem of compressing virtual machine
programs that can run on multiple hardware platforms, and to determine
how to generate compact automata that accurately predict the next
instruction operator or operand based on the present context so that
commonly occurring tokens are given the shortest compression encodings.
A preferred compression technique is an interpretable virtual machine
compression called Byte-coded RISC (BRISC). A BRISC compression results
in an object code that is substantially similar in size to
non-interpretable gzipped CISC programs and supports both client-side
and server-side compilation. Server-side compilation is necessary to
efficiently deliver large application programs from a first computer to
a second computer. For example, existing JIT compilers do not allocate
registers until the native machine target is identified. However, by
performing code optimization and register allocation before a program
is downloaded or otherwise transferred, a pre-compiled mobile code
transmission system dramatically reduces the time required to generate
a final machine or object code on the native machine target.

Further, BRISC compressed code can be interpreted at or about a 12x
time penalty while cutting the overall working set size of the
resulting object code by over 40%.

Alternately, BRISC compressed code can be compiled at over 2.5
megabytes per second to produce executable CISC-type machine code at a
rate that is at least or about 100 times faster than conventional JIT
compilers that are commercially available in the industry. This high
compilation rate permits the recompiling of a program prior to each
execution for users that have severe local disk cache constraints. In
addition, the typical delivery time delay from a network or disk can
mask some or even all of the recompilation time and the resulting code
will run within 1.08 times of the speed of fully optimized machine code
generated, for example, by the Microsoft Visual C++ 5.0 compiler. BRISC
compressed code can also be used to reduce memory requirements for
large desktop applications and to compress programs to fit within the
memory requirements of embedded systems.

A preferred BRISC code compression implementation includes two
techniques that yield a dense randomly accessible program
representation. The first technique is the operand specialization and
the second technique is the opcode combination. Both techniques exploit
a tree compression technique, however, instead of physically separating
streams of instruction information, the operand specialization and
opcode combination techniques quantize the representation of the
streams by packing them into a randomly accessible stream of discrete
byte codes. Thus, BRISC can support just-in-time code generation, for
example, at or about 2.5 MB/sec while yielding code density that is
competitive with the best Lempel-Ziv (LZ) based compression programs.

The BRISC system further includes a compiler that converts high-level
language programs into sequences of instructions for use in a safe
general purpose virtual machine. In one preferred embodiment, the safe
general purpose virtual machine includes a RISC instruction set
augmented with macro-instructions for common operations such as moving
and initializing blocks of data. However, the present
compiling/compression system compresses fully linked executable
programs containing safe general purpose virtual machine programming
language instructions into programs containing BRISC instructions. A
network server can then transmit the BRISC instructions across a
network to client computers by way of browser programs for example,
that also contain an implementation of a safe general purpose virtual
machine. The safe general purpose virtual machine can either interpret
the BRISC instructions directly or convert them to native machine code.
In one preferred embodiment, the present compiling/compression system
is operational on several platforms that include, but are not limited
to, Intel x86/NT, SPARC/Solaris 2.4, PowerPC/NT, and PowerPC/MacOS. For
comparison purposes only, all measurements disclosed in the present
document are based on an Intel Pentium 120 MHz processor with 32 Mbytes
of memory running the Microsoft NT 4.0 workstation.

A. BRISC Generation

Because BRISC code is interpretable in the preferred embodiment, the
BRISC design is constrained to ensure that instructions occur on byte
boundaries. Thus, where the split-stream compression techniques
described above would have used 2-3 bits per opcode, BRISC always uses
even boundary 8 or 16 bits per opcode. However, BRISC makes up for the
increased opcode size by packing more information into each opcode
through the operand specialization and opcode combination techniques.

B. Operand Specialization

Generally speaking, operand specialization has been described as
"burning in" a particular value for one or more of the fields of a
patternized instruction. A more concrete description of operand
specialization is disclosed below using specific examples. For example,
consider the general purpose virtual machine program instruction
ld.iw n0,4(sp). The operational purpose of this instruction is to load
the 32-bit word at address sp+4 into register n0. The .iw suffix on
this instruction indicates that this is the 32-bit integer version of
the instruction. In fact, this particular instruction is one of the
most frequently occurring instruction types used in benchmark programs.
To investigate possible specializations of this instruction, the
instruction is patternized into the following set of instructions from
least general (1) to most general (8).

1. ld.iw n0,4(sp)
2. ld.iw *,4(sp)
3. ld.iw n0,4(*)
4. ld.iw n0,*(sp)
5. ld.iw *,4(*)
6. ld.iw *,*(sp)
7. ld.iw n0,*(*)
8. ld.iw *,*(*)

The most general instruction pattern (8) is part of a base instruction
set. When a base instruction is written in patternized form as above,
asterisks are placed in each field position of the instruction to
indicate that the base instruction pattern can take on any legal field
value in a given field position. For example, writing a base integer
register move instruction as mov.i *,* indicates that each of the
instruction's fields can take on any value that is legal for the
field's type. In the case of the mov.i instruction, both of the fields
can take on any value from n0 through n15 in a general purpose virtual
machine having 16 integer registers.

Because ld.iw n0,4(sp) was the most frequently occurring input
instruction in benchmarks for the present discussion, many of the
specialized forms of this instruction have been added to a dictionary
of possible instruction patterns. By doing so, explicitly representing
common operands such as n0 and 4 is avoided. For example, when the
compressor encounters instruction (1) during an input scan, instruction
patterns (5)-(7) are generated in response as potential dictionary
entries for commonly occurring instructions. To arrive at a
two-operand-specialized instruction pattern such as ld.iw n0,4(*), the
compressor first adds ld.iw n0,*(*) or ld.iw *,4(*) to the dictionary.
The input program is then modified to reflect the presence of the new
instruction pattern (5)-(7). On a subsequent pass over the input
program, the compressor can add to the frequently occurring instruction
dictionary by including a more specialized version of instruction
pattern (5)-(7) by incorporating additional fields. To denote an input
instruction that has been converted to use an operand-specialized
instruction pattern, the instruction pattern is first enclosed in
square brackets and then followed by a list of the literal values to be
substituted into the unspecified fields of the instruction pattern. The
unspecified fields are denoted by asterisks. For example, if
instruction pattern (5) is derived from input instruction (1), then the
input instruction can be rewritten as [ld.iw *,4(*)]:n0,sp.

C. Opcode Combination

The compressor also generates candidate instruction patterns through
the opcode combination technique where adjacent pairs of opcodes are
candidates for opcode combination. For example, if the input program
contains the sequence of instructions [ld.iw n0,*(*)]:4,sp; mov.i n2,n0,
the instruction pattern <[ld.iw n0,*(*)],[mov.i *,*]> becomes a
candidate for addition into the base instruction set. The angled
brackets in the present example denote instruction

the input program several times, generating candidate instruction
patterns and estimating their program size reduction P and their cost
in decompression memory usage working set W. The program size reduction
P equals the reduction in compressed program bytes that would occur if
the candidate instruction pattern were added to the dictionary minus
the number of bytes needed to represent the instruction pattern in the
dictionary. The decompressor for BRISC uses a table of native
instruction sequences for interpretation or native code generation. For
example, the compressor estimates the decompressor's memory usage for a
given dictionary entry by averaging the size in bytes of the
decompression table instruction sequences for a given processor
platform such as the Pentium and PowerPC 601 processor platforms. Thus,
the benefit B of a given instruction pattern equals P-W. Alternatively,
in abundant memory situations, B and P can be equal.

The compressor maintains a heap of candidate instructions that are
sorted by B. After each pass over the input program, the compressor
removes the K best candidates from the heap and adds them to the
frequently occurring instruction dictionary. The compressor then
modifies the input program to reflect the newly available instruction
patterns by first considering each pair of instructions that can be
combined by a new opcode-combined instruction pattern. On each pass,
there can only be one new instruction pattern that applies to a
particular pair. After the instruction combining is complete, the
compressor modifies all instructions in the input program that can be
represented compactly using one of the new instruction patterns.

To avoid undue overhead in updating the input program, the compressor
maintains a table that maps each base instruction pattern to a list of
all input program instructions matching that pattern. Similarly, to
avoid generating candidate instruction patterns that have already been
generated, the compressor maintains a hash table of previously
generated candidates, keyed by base instruction patterns and
specialized field values.

pattems that result from combining opcodes. compressor ceases to hunt for useful instmctions after 

Because BRISC is quantized, not aU instmction combi- ^ p^^^ y^^j^ ^^^^ ^ instructions for which B 
nations make sense. If a combined instmcUon leaves a is positive. Thus the compressor uses a greedy algorithm for 
traihng sub-byte operand^ the compressor can defer com- ^^^^ dictionary. The optimal algorithm would con- 
bmmguntilfortiierspecializ^^^^^ 45 ^.^^^ .^^^ dictionaries and their effect on 
the combination of unspecified operands from the adjacent . i. . u u uu*- i 
instructions can facTitTte optimal packing of combined '=°°>Pf"«i°n. but this would be prohibitively time- 
operands into whole mmibers of bytes. The compressor ^o^^^'^^f- To perform dictionary encoding, Uie compressor 
generatesascandidateinstmctionpatternsaotonlyeachpair "f^aorder-lsemi-static Markov m^^^^ 
of adjacent instructions <ij>, but every possible pair con- 50 f « "t^^' words both the compressor and/or 
sisting of a zero or one-field operand specialization of i mpr«»«or can build a table for each possible instruction 
followed by a zero or one-field operand socialization of j. P*"*™ } fha enumerates the instruction patterns that can 
This ensur4 that the operand specialization technique will ^ °^ ] "'P'^^- " ^°J^ ^!^, ^56 mstruclions can 
not compete with the opcode combination technique by ^^'1"^' ^?^^ ^ instmcUon pat- 
further specializing an instruction before the combiner has a 55 ^'^P^'^' dicUonary for one preferred embodi- 
chance to consider a less-specialized version. 1°*'"' P."1P^f . 

Opcode combination captures common code generation implementing Ice, «|an contain 981 mstructton patterns. Each 

idioms. For example, data movement instructions such as "«"™ction pattern has at most 244 msteuction pattems tha 

J r .t »ur can follow It. There IS a soecial context m the Markov model 

Id.iw and mov.i frequently occur to set up parameters before 7 . . 1 i J • '^l'^^'*"^""^^^ '""^ , 

1, T^. . J • c *t. * ^« for basic block bcemnmes of various types so that the 

call instructions. This is a quantized version of the tree 60 b"^^^b^ * 

^.„,^t' A^„^ tu^ BRISC program remams interpretable. Once the compressor 

construction done m the previous section. , . ^ ...... j- ^- r ii j 

D. BRISC Generation Algorithm t'^.u '^■^ dictionary, it ou^ute the dictionary foUowed 

The compressor begins with the base instiuction set, by the modified mput program that it has compressed dunng 

including 224 instmction patterns in the present example, dictionary cons ction. 

and adds to the instmction set to create a dictionary of 65 A BRISC Compression Example 

frequently occurring instmction pattems. To find useful The safe general purpose virtual machine compiler gen- 

instmctions to add to the dictionary, the compressor scans erates the following sequence of virtual machine program 
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language instructions for the example program intioduced ia dictionary entry for [enter sp**] is dominated by the 

the wire format discussion above. sequence of native instructions that will be generated by the 

enter sp^sp,24 decompressor to generate code for this instruction. For 

spill .i n4,16(sp) just-in-timc conversion to Pentium instructions, the instruc- 

spill.i ra^O(sp) 5 tion space required is 17 bytes, and on a PowerPC 601 the 

mov.i n4,D0 instruction space requires 28 bytes. Averaging these yields 

mov.i n2,Dl Wo25 for [enter sp,*,*]. This instruction pattern saves one 

blc.i n4,0,$L56 byte over the original input program. One input instruction^ 

mov.i nl,n4 [enter sp,sp,24] would be represented in 2 bytes instead of 

mov.i D0,n2 lO 3 bytes because the remaining field values, sp and 24, can be 

call „pepper compacted into a single operand byte. However, the program 

$L56: size reduction P is the 1 byte saved minus the 2 bytes of 

add.i n0,n4,-l dictionary entry. Because the overall benefit B-P~W— 26, 

reload.i n4,16(sp) this instruction pattern would not be added to the dictionary. 

reload.i ra,20(sp) 15 Due to the size of the present program and the code 

exit sp;5p^4 generation and/or interpretation table costs W, none of the 

rjr ra candidate instructions are suitable so that the input program 

For this input program, the initial dictionary is the set of remains. For a larger input program, however, the benefits of 

base instructions including {enter, spilLi, mov.i, ble.i, call, operand specialization and opcode combination outweigh 

addi, reload.!, exit, and rjr}. Because this program is small, 20 the instruction table costs. To illustrate this, an example 

few opportunities exist for instruction combination or spe- dictionary is applied to the example program. The resulting 

cialization. However, the program is useful to illustrate some compressed program is Listed below, 

basic steps of BRISC compression in the first three instruc- <[enter_x4 sp,sp,*]£spill.i_jx4 n4,*(sp)],[spill.i_x4 ra,* 

tions of the program. Applying operand specialization to the (sp)]>: ^74,5 

first three instructions generates following candidate spe- 25 <[mov.i *,nO],[mov.i *,nl]>:n4, n2 

cializations in the first pass of the BRISC algorithm: [ble.i *,0,*]: n4, $L56 

1. [enter sp,*,*] <movi nl,n4],[mov.i n0,n2]> 

[enter VP,*] $^6^'^^'' 

[enter *,*,24] 3^ [sub_It32.I n0,*,*]:n4, 1 

2. [spill.in4,*(*)] epi 

[spill.i *,16(*)] The angled brackets indicate opcode combinations. Also, 

[spill.i *,*(sp)] if ^ instruction contains unspecified fields, here denoted by 

3. [spill.i ra,*(*)] asterisks, the unspecified field is followed by a colon and 
rsoill i * 20f ' ^ ordered literal values that can be inserted into 
Note that one candidate speciaUzation of insttuction 3, V^i&ci fields. sufBx indicates that imme- 

spiU.i '.'(sp), has already been generated by applying diate vidu^ are multipbed by four, 

operand specialization to instruction 2. For each instruction, ^he final instmcUoD of th^ sequence, ep., is a special- 

the set of candidate instructions generated through operand macro-instrucUon and the only such instruction used m 

specialization is called that instruction's operand- 40' the compressor. The semantics of the macro-instruction are 

specialized set If the corresponding base instruction pattern ^ ^« 5""^°' f"°="°"> callee^ved repters, 

is added to the operand-specialized set for a given input frame and return in a normal manner using the 

instruction I, the augmented opcrand-speciaHzed set is con- ">s«niction. All other dictionary entnes are generated 

structed of candidate instruction patterns for I. To apply «!"°"fb f ^e^ ^P^"^ speaal^auon or opcode combina- 

opcode combination to instructions 1 and 2. 16 pairs of « l^""- ^ ^^^^ ?^ » ^ 

instruction patterns are generated by selecting one element mputpropamwasfiO and the resu^tmg compressed program 

&om instruction I's au^cntcd operand-specialized set of 17 by^es. 7 bytes forinstruction opcodes and lObytes 

candidates and one element from instruction 2*s augmented z^^ pacRea luerais. 

operand-speciaUzed set of candidates: ! ^ . .- , ^ . •. 

, **ir 11 ■ 4 FIG. 8 illustrates the load and execute operational details 

Renter sp,^, J,LspUl.i n4, UJ> diagram form for a safe virtual machine pro- 

<{enter sp,*,*],[spill.i *,16(*)]> gram. The safe virtual machine loading and executing begins 

<enter sp,*,*],[spill.i *,*(sp)]> at step 802 and proceeds to generate and load a binary 

<[enter *,sp,* ],[spill.i n4,*(*)]>etc. equivalent of the safe virtual machine program into memory 

The total set of candidate instruction patterns generated 55 of the client/local computer system to begin execution. The 

by instmctions 1 and 2 for the example program would be MAP for the executed program is set at step 808 as a 

the 16 candidates generated through opcode combination permission table set up by the virtual machine's memory 

and the 6 candidates generated through opcode speciahza- management system. Run time for the loaded virtual 

tion. Because the total set of base instmction patterns is only machine program begins at step 814 when control of the 

224, however, the total number of candidates generated by 60 processor is turned over to the virtual machine program. If 

a large program remains manageable. at decision step 818 it is determined that the safe virtual 

A cost-benefit metric operation can be illustrated by machine program violates the assigned MAP, then a pro- 

application to a candidate instruction such as [enter sp,*,*]. gram interrupt occurs and processing continues in an inter- 

The file size cost of a dictionary entry for [enter sp,*,*] is 2 mpt handling routine at step 824. If it is determined at 

bytes, 1 byte to indicate the base instruction, enter, 2 bits to 65 decision step 818 that the safe virtual machine program has 

indicate which field is specialized, and 4 bits to set the not violated the assigned MAP, then processing continues at 

specialized value for that field. The working set cost of a decision step 830. At dedsion step 830, the program could 
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be allowed to continue at step 818 evea if a MAP violatioa 
were encountered and blocked. Alternatively, at decision 
step 830, the program can be terminated and processing 
continues at step 835 in a manner previously disclosed in 
FIG. 4. 
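
By way of illustration only, the run-time portion of FIG. 8 can be sketched as follows. The sketch assumes the MAP is a simple list of address regions and models the program as a trace of memory accesses; the names Region, MapViolation, check_access and run are hypothetical and are not taken from the disclosed embodiment. A violation is modeled as an exception standing in for the program interrupt, after which the access is blocked and the program either continues or is terminated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One hypothetical MAP entry: addresses start..end-1 are accessible."""
    start: int
    end: int
    writable: bool


class MapViolation(Exception):
    """Stands in for the program interrupt raised on a MAP violation."""


def check_access(map_table, address, write):
    """Return normally if the access is permitted, else raise MapViolation."""
    for region in map_table:
        if region.start <= address < region.end and (region.writable or not write):
            return
    raise MapViolation(f"address {address:#x} not permitted")


def run(accesses, map_table, terminate_on_violation):
    """Replay (address, write) accesses under the MAP.

    A violating access is always blocked. The program is then either
    terminated or allowed to continue, depending on the policy flag.
    """
    completed = []
    for address, write in accesses:
        try:
            check_access(map_table, address, write)
        except MapViolation:
            if terminate_on_violation:
                return completed, "terminated"
            continue  # access blocked, program continues
        completed.append((address, write))
    return completed, "finished"
```

In this sketch the policy flag corresponds to the choice made at decision step 830 between continuing after a blocked violation and terminating the program.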
Summary 

The safe general purpose virtual machine computing 
system includes a general purpose virtual machine imple- 
mentation that generates and/or verifies and executes safe 
virtual machine programs. Safe virtual machine programs 
that are stored and/or generated prior to transmission from a 
first virtual machine to a second virtual machine can be 
compressed prior to transmission to significantly reduce the 
size of the program being transmitted. 
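
By way of illustration only, the cost-benefit metric B = P - W used when selecting dictionary entries for compression can be sketched as follows. The Candidate class, its field names and the select_best function are hypothetical. The numbers for [enter sp,*,*] are those of the worked example in the description (1 byte saved, 2 bytes of dictionary entry, W=25), and the use of heapq.nlargest is an assumed stand-in for removing the K best candidates from a heap sorted by B.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate instruction pattern and its costs in bytes."""
    pattern: str
    bytes_saved: int   # bytes saved in the input program by using the pattern
    entry_bytes: int   # file size cost of the dictionary entry
    table_bytes: int   # W: decompressor table cost for the entry

    @property
    def size_reduction(self):
        """P: bytes saved minus bytes needed for the dictionary entry."""
        return self.bytes_saved - self.entry_bytes

    @property
    def benefit(self):
        """B = P - W."""
        return self.size_reduction - self.table_bytes


def select_best(candidates, k):
    """Return up to k candidates with the largest positive benefit B."""
    worthwhile = [c for c in candidates if c.benefit > 0]
    return heapq.nlargest(k, worthwhile, key=lambda c: c.benefit)


# Values from the worked example: P = 1 - 2 = -1 and B = -1 - 25 = -26.
enter = Candidate("[enter sp,*,*]", bytes_saved=1, entry_bytes=2, table_bytes=25)
```

Under these assumptions the example candidate has a negative benefit and is never selected, which matches the conclusion of the worked example that the pattern would not be added to the dictionary.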

Although specific embodiments of the present invention 
are disclosed herein, it is expected that persons skilled in the 
art can and will design alternative safe general purpose 
virtual machine systems that are within the scope of the 
following claims either literally or under the Doctrine of
Equivalents. 

What is claimed is: 

1. A computer readable medium containing computer 
executable instructions to perform a method for implement- 
ing a safe general purpose virtual machine, said method 
comprising: 

defining said safe general purpose virtual machine having 
a general purpose memory protection model that does 
not rely or depend upon a specific hardware architec- 
ture or programming language feature for memory 
protection, at least one memory access permission 
based on said memory protection model, a bytecode 
reduced instruction set computer compiler wherein said 
bytecode reduced instructions occur only on byte 
boundaries, and a plurality of reduced instruction set 
computer virtual machine instructions selected from at 
least one of a group of types comprised of: control 
transfer instructions, arithmetic instructions, bitwise 
manipulation instructions, register/memory 
instructions, and miscellaneous instructions; 

generating a safe virtual machine program from a source 
program that contains at least one unsafe instruction, 
wherein said safe virtual machine program is hardware 
architecture independent and programming language 
independent; 

optionally compressing said safe virtual machine program 
pursuant to a bytecode reduced instruction set computer 
compression system, wherein said bytecode reduced 
instructions occur only on byte boundaries; and 

executing said safe virtual machine program within a 
memory space defined by said memory protection 
model and said at least one memory access permission. 

2. A method according to claim 1 wherein said step of 
generating includes: 

identifying a first set of program instructions within said 
source program capable of converting a legal memory 
address into a memory address prohibited by said 
memory access permissions; 

identifying a second set of program instructions within 
said source program that are capable of jumping to a 
memory address prohibited by said memory access 
permissions; 

preassigning explicit register allocations for each sym- 
bolic reference in said source program; and 

altering said first set of program instructions and said 
second set of program instructions to cause a trap 
catchable by a trusted program host.

3. A method according to claim 1 wherein said step of
optionally compressing includes: 

determining a compression benefit for said safe virtual 
machine program; and 

implementing a dense randomly accessible stream of 
bytecodes using a bytecode reduced instruction set 
computer compression on said safe virtual machine 
program in positive response to said step of determin- 
ing. 

4. A method according to claim 3 wherein said step of 
determining includes: 

generating a list of repeated instruction patterns in said 
safe virtual machine program to add to an instruction 
dictionary; 

estimating a program size reduction P to said safe virtual
machine program based on a record of said list of
repeated instruction patterns in said instruction
dictionary;

identifying a number of bytes W required to represent said
list of repeated instruction patterns in said instruction
dictionary; and

determining a compression benefit B based on a weighted 
evaluation of P and W. 

5. A method according to claim 4 including:

determining said compression benefit B based on an
arithmetic difference of P and W; and

compressing said safe virtual machine program in
response to a positive value of B.

6. A method according to claim 4 including:

compressing said safe virtual machine program in
response to a positive value of P.

7. A method according to claim 4 wherein said step of 
implementing includes: 

maintaining a heap of candidate compression instructions
from said list of repeated instruction patterns;

adding distinct instruction combinations of said list of
repeated instruction patterns in said heap to said
instruction dictionary;

associating a unique opcode-combined instruction with
each of said distinct instruction combinations; and

substituting said unique opcode-combined instruction
with a corresponding one of said distinct instruction
combinations in said safe virtual machine program.

8. A method according to claim 7 wherein said step of 
maintaining includes: 

hashing said heap to identify repeated ones of said list of
repeated instruction patterns; and

updating said instruction dictionary with only one copy of
each of said distinct instruction combinations.

9. A method according to claim 1 wherein said step of 
optionally compressing includes: 

generating operand specialization compression on
repeated instruction operands within said safe virtual
machine program; and

generating opcode combination compression on repeated
instruction patterns within said safe virtual machine
program.

10. A method according to claim 9 wherein said step of 
generating operand specialization compression includes: 

adding commonly repeated instructions to a repeated
instruction dictionary; and

implicitly representing repeated operands from among
said commonly repeated instructions in said repeated
instruction dictionary.

11. A method according to claim 9 wherein said step of
generating opcode combination compression includes: 

adding repeated pairs of adjacent instruction combina- 
tions to a repeated instruction dictionary; 

generating a single meta-instruction for each unique one 
of repeated pairs of adjacent instruction combinations; 
and 

combining pairs of zero-field and one-field instructions to 
maximize optimal byte boundaries of densely packed 
instructions. 

12. A method according to claim 11 wherein said step of 
combining pairs includes: 

delaying said step of combining until a maximum number
of instruction pairs are available to maximize a selection
of optimal byte boundaries of densely packed
instructions.

13. A method according to claim 1 wherein said step of 
executing includes: 

verifying that said safe virtual machine program is free of
unsafe instructions, hardware architecture
dependencies, and programming language dependencies;

assigning a memory access permission to said safe virtual
machine program;

generating a program execution interrupt to trap said safe
virtual machine program at a time a memory address is
accessed other than a memory address allowed by said
assigned memory access permission;

generating a native machine executable program from
said safe virtual machine program just-in-time to
execute said native machine executable program; and

executing said native machine executable program on a
native machine hardware host.

14. A method according to claim 13 wherein said step of 
verifying includes: 

identifying a safe program instruction from an unsafe
program instruction within said safe virtual machine
program; and

generating a new safe virtual machine program in nega-
tive response to an unsafe program instruction within
said safe virtual machine program.

15. A method according to claim 1 wherein said step of 
generating includes: 

converting said source program into a sequence of virtual 
machine program instructions; 

identifying ones of said sequence of virtual machine 
program instructions capable of accessing memory 
addresses other than said memory addresses defined by 
said at least one memory access permission; and 

modifying said virtual machine program with safety 
check instructions to prevent access to memory 
addresses other than said memory addresses defined by 
said at least one memory access permission, wherein 
said virtual machine program with modifications is said 
safe virtual machine program. 

16. A method according to claim 15 wherein said step of 
modifying includes: 

inserting dedicated register segment matching instruc- 
tions for at least one of said sequence of virtual 
machine program instructions capable of accessing 
memory addresses other than said memory addresses 
defined by said at least one memory access permission. 

17. A method according to claim 15 wherein said step of
modifying includes:

inserting dedicated register address sandboxing instruc-
tions for at least one of said sequence of virtual
machine program instructions capable of accessing
memory addresses other than said memory addresses
defined by said at least one memory access permission.

18. A method according to claim 1 including:

implementing run time performance enhancing optimizations
on said safe virtual machine program.

19. A method according to claim 18 wherein said imple- 
menting includes: 

removing related non-essential safety check instructions 
from said safe virtual machine program, wherein said 
related safety check instructions are both positioned in
response to the setting or using of a common register 
and prevent unauthorized access to memory addresses 
other than said memory addresses defined by said at 
least one memory access permission. 

20. A method according to claim 18 wherein said imple- 
menting includes: 

improving said safe virtual machine program logic flow. 

21. A method comprising: 

defining a safe general purpose virtual machine having a 
general purpose memory protection model that does not 
rely or depend upon a specific hardware architecture or 
programming language feature for memory protection, 
at least one memory access permission based on said 
memory protection model, a bytecode reduced instruc- 
tion set computer compiler wherein said bytecode 
reduced instructions occur only on byte boundaries, 
and a plurality of reduced instruction set computer 
virtual machine instructions selected from at least one
of a group of types comprised of: control transfer
instructions, arithmetic instructions, bitwise manipula-
tion instructions, register/memory instructions, and
miscellaneous instructions; 

generating, on a first computer, a safe virtual machine
program from a source program that contains at least 
one unsafe instruction, wherein said safe virtual 
machine program is hardware architecture independent 
and programming language independent; 

optionally compressing said safe virtual machine program 
pursuant to a bytecode reduced instruction set computer 
compression system; 

transferring said safe virtual machine program from said 
first computer to said second computer; and 

executing said safe virtual machine program within a 
memory space defined by said memory protection 
model and said at least one memory access permission 
on a second computer that is independent from and 
network accessible to said first computer, and said first 
computer is hardware architecturally distinguishable 
from said second computer. 

22. A method according to claim 21 wherein said step of 
generating includes: 

identifying a first set of program instructions within said 
source program capable of converting a legal memory 
address into a memory address prohibited by said 
memory access permissions; 

identifying a second set of program instructions within
said source program capable of jumping to a memory
address prohibited by said memory access permissions;

preassigning explicit register allocations for each sym- 
bolic reference in said source program; and 

altering said first set of program instructions and said
second set of program instructions to cause a trap
catchable by a trusted program host on said second
computer.

23. A method according to claim 21 wherein said step of
executing includes: 

verifying that said safe virtual machine program is free of 
unsafe instructions, hardware architecture 
dependencies, and programming language dependen- 
cies; 

assigning a memory access permission to said safe virtual
machine program;

generating a program execution interrupt to trap said safe
virtual machine program at a time a memory address is
accessed other than a memory address allowed by said
assigned memory access permission;

generating a native machine executable program from
said safe virtual machine program just-in-time to
execute said native machine executable program; and

executing said native machine executable program on
native machine hardware of said second computer.

24. A method according to claim 23 including: 
decompressing said safe virtual machine program. 

25. A method according to claim 23 wherein said step of 
verifying includes: 

distinguishing a safe program instruction from an unsafe
program instruction within said safe virtual machine
program; and

generating a new safe virtual machine program in nega- 
tive response to an unsafe program instruction within 
said safe virtual machine program. 

26. A system for executing a safe virtual machine program 
in a safe general purpose virtual machine, said system 
comprising: 

a first computer communicatively connected to a second 
computer, said first computer being operationally inde- 
pendent from said second computer; 

means for defining a safe general purpose virtual machine 
having a general purpose memory protection model 
that does not rely or depend upon a specific hardware 
architecture or programming language feature for 
memory protection, at least one memory access per- 
mission based on said memory protection model, a 
bytecode reduced instruction set computer compiler 
wherein said bytecode reduced instructions occur only 
on byte boundaries, and a plurality of reduced instruc- 
tion set computer virtual machine instructions selected 
from at least one of a group of types comprised of: 
control transfer instructions, arithmetic instructions, 
bitwise manipulation instructions, register/memory 
instructions, and miscellaneous instructions; 

means for generating, on said first computer, a safe virtual 
machine program from a source program that contains 
at least one unsafe instruction, wherein said safe virtual 
machine program is hardware architecture independent 
and programming language independent; 

means for optionally compressing said safe virtual 
machine program pursuant to a bytecode reduced 
instruction set computer compression system; 

means for transferring said safe virtual machine program 
from said first computer to said second computer; and 

means for executing said safe virtual machine program 
within a memory space defined by said memory pro- 
tection model and said at least one memory access 
permission on said second computer. 

27. A system according to claim 26 wherein said means 
for generating includes: 

means for identifying a first set of program instructions
within said source program capable of converting a
legal memory address into a memory address prohib-
ited by said memory access permissions;

means for identifying a second set of program instructions
within said source program capable of jumping to a
memory address prohibited by said memory access
permissions;

means for preassigning explicit register allocations for 
each symbolic reference in said source program; and 

means for altering said first set of program instructions 
and said second set of program instructions to cause a 
trap catchable by a trusted program host on said second
computer. 

28. A system according to claim 26 wherein said means 
for executing includes: 

means for verifying that said safe virtual machine pro- 
gram is free of unsafe instructions, hardware architec- 
ture dependencies, and programming language depen- 
dencies; 

means for assigning a memory access permission to said 
safe virtual machine program; 

means for generating a program execution interrupt to 
trap said safe virtual machine program at a time a 
memory address is accessed other than a memory 
address allowed by said assigned memory access per- 
mission; 

means for generating a native machine executable pro-
gram from said safe virtual machine program just-in-
time to execute said native machine executable pro-
gram; and

means for executing said native machine executable pro- 
gram on native machine hardware of said second 
computer. 

29. A system according to claim 28 including:

means for decompressing said safe virtual machine pro- 
gram. 

30. A system according to claim 28 wherein said means 
for verifying includes: 

means for distinguishing a safe program instruction from
an unsafe program instruction within said safe virtual
machine program; and

means for generating a new safe virtual machine program 
in negative response to an unsafe program instruction 
within said safe virtual machine program.

31. A computer readable medium containing computer
executable instructions to perform a method for implement-
ing a safe general purpose virtual machine, said method
comprising: 

defining said safe general purpose virtual machine having
a general purpose memory protection model, at least
one memory access permission based on said memory
protection model, a bytecode reduced instruction set
computer compiler, and a plurality of reduced instruc-
tion set computer virtual machine instructions compris-
ing: non-trap type control instructions that operate only
in user mode and do not invoke a privileged or super-
visory mode, arithmetic instructions, bitwise manipu-
lation instructions, register/memory instructions that
include one or more load and store instructions that are
enhanced to allow a variety of higher level addressing
modes to facilitate generating efficient code of a variety
of hardware target platforms, and miscellaneous
instructions that include a no-operation instruction, one
or more instructions that return specific hardware envi-
ronment information, and one or more register-to-
memory location link and release instructions;

generating a safe virtual machine program from a source
program that contains at least one unsafe instruction, 
wherein said safe virtual machine program is hardware 
architecture independent and programming language 
independent; 

optionally compressing said safe virtual machine program 
pursuant to a bytecode reduced instruction set computer
compression system; and 

executing said safe virtual machine program within a 
memory space defined by said memory protection 
model and said at least one memory access permission. 

32. A method for compressing a safe virtual machine
program, which is hardware architecture independent and 
programming language independent, said method compris- 
ing: 

generating a list of repeated instruction patterns in said 
safe virtual machine program to add to an instruction 
dictionary; 

estimating a program size reduction P to said safe virtual
machine program based on a record of said list of
repeated instruction patterns in said instruction
dictionary;

identifying a number of bytes W required to represent said
list of repeated instruction patterns in said instruction
dictionary;

determining a compression benefit B based on a weighted 
evaluation of P and W; and 

implementing a dense randomly accessible stream of 
bytecodes using a bytecode reduced instruction set 
computer compression on said safe virtual machine 
program in positive response to said step of 
determining, wherein said bytecode reduced instruc- 
tions occur only on byte boundaries. 

33. A method according to claim 32 including: 
determining said compression benefit B based on an 

arithmetic difference of P and W; and 
compressing said safe virtual machine program in 
response to a positive value of B. 
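
The benefit test recited in claims 32 and 33 can be illustrated with a minimal sketch. The claims do not specify how P and W are computed, so the following assumes, purely for illustration, that each instruction is a byte string, that each dictionary entry is a tuple of adjacent instructions replaced by a one-byte opcode, and that occurrences are counted independently per pattern. The function names are hypothetical.

```python
def count_occurrences(program, pattern):
    """Count non-overlapping occurrences of `pattern` (a tuple) in `program` (a list)."""
    n, k, i, count = len(program), len(pattern), 0, 0
    while i + k <= n:
        if tuple(program[i:i + k]) == pattern:
            count += 1
            i += k
        else:
            i += 1
    return count


def pattern_bytes(pattern):
    """Total size in bytes of the instructions making up a pattern."""
    return sum(len(instr) for instr in pattern)


def compression_benefit(program, dictionary, opcode_size=1):
    """Return (P, W, B) with B = P - W, under the assumptions stated above."""
    # P: estimated bytes saved by replacing each occurrence with one opcode.
    p = sum(
        count_occurrences(program, pat) * (pattern_bytes(pat) - opcode_size)
        for pat in dictionary
    )
    # W: bytes needed to store the patterns in the instruction dictionary.
    w = sum(pattern_bytes(pat) for pat in dictionary)
    return p, w, p - w


program = [b"\x01\x02", b"\x03\x04"] * 4 + [b"\x05"]
dictionary = [(b"\x01\x02", b"\x03\x04")]
p, w, b = compression_benefit(program, dictionary)
compress = b > 0  # claim 33: compress only in response to a positive B
```

In this example the pattern occurs four times and occupies four bytes, so P is 12, W is 4, and B is 8, and compression would proceed.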

34. A method according to claim 32 including: 
compressing said safe virtual machine program in
response to a positive value of P.

35. A method according to claim 32 wherein said step of 
implementing includes: 

maintaining a heap of candidate compression instructions 
from said list of repeated instruction patterns; 
adding distinct instruction combinations of said list of
repeated instruction patterns in said heap to said 
instruction dictionary; 

associating a unique opcode-combined instruction with
each of said distinct instruction combinations; and 

substituting said unique opcode-combined instruction 
with a corresponding one of said distinct instruction 
combinations in said safe virtual machine program. 

36. A method according to claim 35 wherein said step of 
maintaining includes: 

hashing said heap to identify repeated ones of said list of
repeated instruction patterns; and

updating said instruction dictionary with only one copy of
each of said distinct instruction combinations.
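
The steps of claims 35 and 36 can be sketched as follows. This is an illustrative reading, not the claimed implementation: it assumes patterns are pairs of adjacent instructions, uses a hash table (`collections.Counter`) so that each distinct pair is recorded once, and uses a heap ordered by repeat count to choose candidates. The names `build_dictionary` and `substitute` and the `C0`, `C1`, ... opcode labels are hypothetical.

```python
import heapq
from collections import Counter


def build_dictionary(program, max_entries):
    """Map the most frequently repeated adjacent pairs to unique combined opcodes."""
    # Hashing: each distinct adjacent pair is stored once, with its count.
    counts = Counter(zip(program, program[1:]))
    # Heap of candidates; counts are negated so the most frequent pops first.
    heap = [(-c, pair) for pair, c in counts.items() if c > 1]
    heapq.heapify(heap)
    dictionary = {}
    while heap and len(dictionary) < max_entries:
        _, pair = heapq.heappop(heap)
        dictionary[pair] = f"C{len(dictionary)}"  # unique combined opcode
    return dictionary


def substitute(program, dictionary):
    """Replace each dictionary pair in the program with its combined opcode."""
    out, i = [], 0
    while i < len(program):
        pair = tuple(program[i:i + 2])
        if pair in dictionary:
            out.append(dictionary[pair])
            i += 2
        else:
            out.append(program[i])
            i += 1
    return out


prog = ["ld", "add", "st", "ld", "add", "st", "ld", "add", "nop"]
d = build_dictionary(prog, max_entries=1)
packed = substitute(prog, d)
```

Here the pair `("ld", "add")` repeats three times, so it receives the single available combined opcode and the nine-instruction program shrinks to six entries.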

37. A method for compressing a safe virtual machine 
program, which is hardware architecture independent and 
programming language independent, said method comprising:

generating operand specialization compression on 
repeated instruction operands within said safe virtual 
machine program including: 

adding commonly repeated instructions to a repeated 
instruction dictionary; and implicitly representing 
repeated operands from among said commonly 
repeated instructions in said repeated instruction
dictionary; 

generating opcode combination compression on repeated 
instruction patterns within said safe virtual machine 
program, including: 

adding repeated pairs of adjacent instruction combina- 
tions to a repeated instruction dictionary; 

generating a single meta-instruction for each unique one
of repeated pairs of adjacent instruction combinations; 
and 

combining pairs of zero-field and one-field instructions to 
maximize optimal byte boundaries of densely packed 
instructions.
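
The operand specialization step of claim 37 can be illustrated with a short sketch. It assumes, for illustration only, that an instruction is an `(opcode, operands)` tuple and that an instruction repeated at least `min_count` times is replaced by a specialized opcode whose operands are implied by the dictionary entry rather than encoded in the program. The function name, the `min_count` threshold, and the `spec0`, `spec1`, ... labels are hypothetical.

```python
from collections import Counter


def specialize_operands(program, min_count=2):
    """Replace repeated (opcode, operands) instructions with operand-free opcodes."""
    counts = Counter(program)
    table = {}
    for instr, count in counts.items():
        opcode, operands = instr
        # Only instructions that repeat and actually carry operands benefit.
        if count >= min_count and operands:
            table[instr] = (f"spec{len(table)}", ())  # operands now implicit
    specialized = [table.get(instr, instr) for instr in program]
    return specialized, table


program = [
    ("ld", ("sp", 4)),
    ("add", ("r1", "r2")),
    ("ld", ("sp", 4)),
    ("nop", ()),
]
specialized, table = specialize_operands(program)
```

After this pass both `ld sp, 4` instructions become the operand-free `spec0`, and the table records which operands `spec0` stands for so a decoder can restore them.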

38. A method according to claim 37 wherein said step of
combining pairs includes: 

delaying said step of combining until a maximum number 
of instruction pairs are available to maximize a selec- 
tion of optimal byte boundaries of densely packed 
instructions.